library(topGO)
library(knitr)   # to use kable()
library(limma)   # to use vennDiagram()
library(ggplot2)
TAG <- params$TAG
VAR <- params$VAR
ANNOTATION <- list(genes = '../2019-07-26/genes/annotation.txt',
                   isoforms = '../2019-07-26/transcripts/annotation.txt')
TAGDIR <- paste0('../2020-01-08/', TAG, '/')

Introduction

This is the enrichment analysis for isoforms regulated by hatching. Because I quantified the association of expression with hatching in two different ways, this document has two sections. Section Variance reports the enrichment analysis performed with isoforms ordered by the proportion of expression variance explained by hatching. Note that some isoforms with low variance may still be highly associated with hatching, even if the fold change between levels of this factor is low. Section Differential expression uses an ordering of isoforms based on the significance of the differential expression between levels of hatching, which does depend on fold change.

Reading the data

Functional annotation is in 2019-07-26. I will also upload two lists of isoforms, with either proportion of variance explained by hatching or p-value of differential expression test.

tagVariance <- read.table(paste0(TAGDIR, VAR, '_variance.txt'))
tagPValue   <- read.table(paste0(TAGDIR, VAR, '_pvalue.txt'))
annotation  <- read.table(ANNOTATION[[TAG]], col.names = c('tagname', 'goterms'))

To initialize the topGOdata object, I will need the gene list as a named numeric vector, where the names are the isoforms identifiers and the numeric values, either the portion of variance explained by hatching or the p-values of the differential expression test. The structure() function adds attributes to an object.

Variance <- structure(tagVariance[,1], names = row.names(tagVariance))
PValues  <- structure(tagPValue[,1],   names = row.names(tagPValue))
rm(tagVariance, tagPValue)

There are 33037 isoforms scored with a variance portion and a differential expression p-value. It should actually be the exact same isoforms. The annotation data frame has more than one GO term for every tag, separated by |. I need a named list.

head(annotation)
##          tagname
## 1 TCONS_00000002
## 2 TCONS_00000010
## 3 TCONS_00000011
## 4 TCONS_00000016
## 5 TCONS_00000017
## 6 TCONS_00000018
##                                                             goterms
## 1                                  GO:0008417|GO:0016020|GO:0006486
## 2                                  GO:0043130|GO:0005515|GO:0043161
## 3                       GO:0005840|GO:0015935|GO:0003735|GO:0006412
## 4 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 5 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
## 6 GO:0006355|GO:0003700|GO:0006357|GO:0043565|GO:0005634|GO:0032502
allgenes2GO <- strsplit(as.character(annotation$goterms), "|", fixed = TRUE)
names(allgenes2GO) <- annotation$tagname
rm(annotation)

There are 19735 isoforms with GO annotations. But the differential expression analysis includes many more isoforms. In order to include the not-annotated isoforms in the enrichment analysis, to see if annotated or not annotated isoforms are more or less often differentially expressed, I should attribute a GO term to them. According to [http://geneontology.org/docs/faq/] nowadays we express lack of annotation by annotating to the root nodes, i.e. GO:0008150 biological_process, GO:0003674 molecular_function, and GO:0005575 cellular_component.

for (tag in unique(c(names(PValues), names(Variance)))) {
   if (! tag %in% names(allgenes2GO)) {
      allgenes2GO <- append(allgenes2GO,
         structure(list(c("GO:0008150", "GO:0003674", "GO:0005575")), names = tag))
   }
}

Using differential expression p-values

Building the topGO object

Creation of a topGO dataset is documented in section 4 of topGO’s the user manual: https://bioconductor.org/packages/release/bioc/vignettes/topGO/inst/doc/topGO.pdf. I need to use the new command, and fill up the slots. The annot object must be a function that compiles “a list of GO terms such that each element in the list is a character vector containing all the gene identifiers that are mapped to the respective GO term.” There are several options, that you can check using help(annFUN.gene2GO), for example. The annFUN.gene2GO requires a gene2GO argument, which is the list of gene-to-GO terms I made before. The geneSelectionFun object is a function that selects the interesting (most significant) genes. It is required to perform Fisher’s exact test. The nodeSize is used to prune the GO hierarchy from the terms which have less than n annotated genes.

I generate three datasets, to analyse each of the three ontologies.

selection <- function(allScores) {return(allScores < 0.01)}
GOdata.BP <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'BP', 
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdata.MF <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'MF',
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdata.CC <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'CC',
   allGenes = PValues,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
   Num_Genes = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), numGenes),
   Num_GO_terms = sapply(list(GOdata.BP, GOdata.MF, GOdata.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
Number of feasible genes or transcripts and number of GO terms used in each data set.
ontology Num_Genes Num_GO_terms
BP 26235 1582
MF 30580 791
CC 21616 390

Running the tests

There are more than one way to test for enrichment. Something that took me a while to understand is that not only there are different test statistics (Fisher’s exact test, Kolmogorov-Smirnov, and others) but also different algorithms: classic, elim, weight… The algorithms are ways to deal with the dependence structure among GO terms due to topology. Some algorithms are compatible with all statistics implemented in topGO. But the weight and the parentchild algorithms are only applicable to Fisher’s exact test. I am not interested in the classic algorithm, which treats GO nodes as independent, and therefore produces an excess of significant results. I will not use the Fisher’s exact test, because it dependes on an arbitrary threshold of significance on non-adjusted p-values.

BP.elim     <- runTest(GOdata.BP, algorithm = "elim",     statistic = "ks")
BP.weight01 <- runTest(GOdata.BP, algorithm = "weight01", statistic = "ks")
BP.lea      <- runTest(GOdata.BP, algorithm = "lea",      statistic = "ks")
MF.elim     <- runTest(GOdata.MF, algorithm = "elim",     statistic = "ks")
MF.weight01 <- runTest(GOdata.MF, algorithm = "weight01", statistic = "ks")
MF.lea      <- runTest(GOdata.MF, algorithm = "lea",      statistic = "ks")
CC.elim     <- runTest(GOdata.CC, algorithm = "elim",     statistic = "ks")
CC.weight01 <- runTest(GOdata.CC, algorithm = "weight01", statistic = "ks")
CC.lea      <- runTest(GOdata.CC, algorithm = "lea",      statistic = "ks")

ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
   algorithm = rep(c("elim", "weight01", "lea"), 3),
   TermsTested = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) length(score(X))),
   Significant = sapply(list(BP.elim, BP.weight01, BP.lea, MF.elim, MF.weight01, MF.lea, CC.elim, CC.weight01, CC.lea), function(X) sum(score(X) < 0.01)))

kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.
ontology algorithm TermsTested Significant
BP elim 1582 24
BP weight01 1582 10
BP lea 1582 37
MF elim 791 10
MF weight01 791 13
MF lea 791 13
CC elim 390 2
CC weight01 390 1
CC lea 390 3
rm(ResultsSummary)

Results

The topGO package only looks for GO terms that are overrepresented among the top of the gene list (genes with lowest score, assumed to be a \(p\) value; see 2020-06-30). Thus, underrepresented GO terms are ignored, even though it could be interesting to know if among differentially expressed genes there are fewer genes annotated with certain terms than expected by chance. The way to identify those terms would be to reverse the scores. In this case, running the analysis with \(1 - p\) values would be to search for terms that are overrepresented among non-differentially expressed genes. Not worth pursuing now.

Note that, despite my initial confusion (see previous commits), it is not true that significant terms with fewer significant genes than expected are significantly underrepresented terms. It is just possible that a term is significant because of the overall distribution of scores of genes annotated with that term, even if none of those genes (or just fewer than expected) are significant. This is because of the use of Kolmogorov-Smirnov test, instead of the Fisher’s exact test. Note that even though no hard threshold is necessary in the K-S test, GenTable() produces a summary table with number of annotated and significant genes for every significant GO term. That number of significant term is determined by the selection() function passed to the topGO object, only for display purposes unless using Fisher’s exact test.

Biological process

orderedTerms <- names(sort(score(BP.weight01)))
significant.weight01 <- score(BP.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(BP.lea)[orderedTerms] <= 0.01
significant.elim     <- score(BP.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.pvalue.sigTerms <- sigTerms

BP.all <- GenTable(GOdata.BP, elim=BP.elim, weight01=BP.weight01, lea=BP.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
write.table(BP.all, file=paste(TAG, VAR, 'BPsummary.tsv', sep='/'),
            quote=FALSE, sep='\t', row.names=FALSE)

kable(BP.all,
   caption = "Most over-represented terms of the Biological Process ontology.")
Most over-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0001522 pseudouridine synthesis 31 2 0.30 2 0.0014 0.0014 0.00143
GO:0046854 phosphatidylinositol phosphorylation 45 0 0.44 4 0.0027 0.0027 0.00272
GO:0006400 tRNA modification 58 1 0.57 45 0.0163 0.0029 0.01633
GO:0050684 regulation of mRNA processing 15 1 0.15 12 0.0059 0.0038 0.00592
GO:0030513 positive regulation of BMP signaling pat… 9 0 0.09 8 0.0052 0.0052 0.00525
GO:0060271 cilium assembly 73 1 0.72 7 0.0049 0.0068 0.00038
GO:0016579 protein deubiquitination 90 2 0.88 14 0.0070 0.0070 0.00696
GO:0048015 phosphatidylinositol-mediated signaling 40 0 0.39 20 0.0081 0.0081 0.00806
GO:0070286 axonemal dynein complex assembly 9 0 0.09 21 0.0082 0.0082 0.00823
GO:0055072 iron ion homeostasis 21 0 0.21 622 0.4004 0.0094 0.40036
GO:0007411 axon guidance 11 0 0.11 27 0.0118 0.0118 0.01182
GO:0006418 tRNA aminoacylation for protein translat… 87 1 0.85 78 0.0333 0.0124 0.03332
GO:0015937 coenzyme A biosynthetic process 8 0 0.08 41 0.0147 0.0147 0.01467
GO:0007275 multicellular organism development 85 0 0.83 108 0.0469 0.0155 0.00208
GO:0046434 organophosphate catabolic process 52 0 0.51 893 0.5923 0.0158 0.59232
GO:0060828 regulation of canonical Wnt signaling pa… 5 1 0.05 46 0.0168 0.0168 0.01683
GO:0050953 sensory perception of light stimulus 11 0 0.11 846 0.5527 0.0172 0.55272
GO:0006511 ubiquitin-dependent protein catabolic pr… 204 2 2.00 325 0.2086 0.0178 0.20864
GO:0032543 mitochondrial translation 5 0 0.05 52 0.0186 0.0186 0.01859
GO:0048017 inositol lipid-mediated signaling 50 0 0.49 53 0.0188 0.0188 0.00241
GO:0006488 dolichol-linked oligosaccharide biosynth… 11 0 0.11 54 0.0189 0.0189 0.01886
GO:0006654 phosphatidic acid biosynthetic process 10 0 0.10 55 0.0189 0.0189 0.01887
GO:0007040 lysosome organization 8 0 0.08 58 0.0193 0.0193 0.01933
GO:0006030 chitin metabolic process 103 0 1.01 142 0.0681 0.0222 0.06805
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term = Term(GOTERM[sigTerms]),
                 Definition = Definition(GOTERM[sigTerms]),
                 PValue=score(BP.weight01)[sigTerms]),
      caption = paste('Biological process terms significantly associated with',
                      VAR, 'according to all 3 algorithms', sep=' '))
Biological process terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0001522 pseudouridine synthesis The intramolecular conversion of uridine to pseudouridine within an RNA molecule. This posttranscriptional base modification occurs in tRNA, rRNA, and snRNAs. 0.0014295
GO:0046854 phosphatidylinositol phosphorylation The process of introducing one or more phosphate groups into a phosphatidylinositol, any glycerophosphoinositol having one phosphatidyl group esterified to one of the hydroxy groups of inositol. 0.0027221
GO:0050684 regulation of mRNA processing Any process that modulates the frequency, rate or extent of mRNA processing, those processes involved in the conversion of a primary mRNA transcript into a mature mRNA prior to its translation into polypeptide. 0.0038049
GO:0030513 positive regulation of BMP signaling pathway Any process that activates or increases the frequency, rate or extent of BMP signaling pathway activity. 0.0052479
GO:0060271 cilium assembly The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. 0.0067752
GO:0016579 protein deubiquitination The removal of one or more ubiquitin groups from a protein. 0.0069571
GO:0048015 phosphatidylinositol-mediated signaling A series of molecular signals in which a cell uses a phosphatidylinositol-mediated signaling to convert a signal into a response. Phosphatidylinositols include phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0080632
GO:0070286 axonemal dynein complex assembly The aggregation, arrangement and bonding together of a set of components to form an axonemal dynein complex, a dynein complex found in eukaryotic cilia and flagella, in which the motor domain heads interact with adjacent microtubules to generate a sliding force which is converted to a bending motion. 0.0082319

I think the GO graph is useful to see the relationship among the significant terms. But too large graphs are impossible to read. I don’t know how to split the graph in meaningful subgraphs.

showSigOfNodes(GOdata.BP, score(BP.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 276 
## Number of Edges = 533 
## 
## $complete.dag
## [1] "A graph with 276 nodes."

This is just a example of the most significant GO term:

showGroupDensity(GOdata.BP, orderedTerms[1], rm.one=FALSE)

showGroupDensity(GOdata.BP, orderedTerms[2], rm.one=FALSE)

showGroupDensity(GOdata.BP, orderedTerms[3], rm.one=FALSE)

Molecular function

orderedTerms <- names(sort(score(MF.weight01)))
significant.weight01 <- score(MF.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(MF.lea)[orderedTerms] <= 0.01
significant.elim     <- score(MF.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.pvalue.sigTerms <- sigTerms

MF.all <- GenTable(GOdata.MF, elim=MF.elim, weight01=MF.weight01, lea=MF.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
write.table(MF.all, file=paste(TAG, VAR, 'MFsummary.tsv', sep='/'),
            quote=FALSE, sep='\t', row.names=FALSE)

kable(MF.all,
   caption = "Most over-represented terms of the Molecular Function ontology.")
Most over-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0016747 transferase activity, transferring acyl … 158 3 1.52 149 0.18300 0.00036 0.20455
GO:0140101 catalytic activity, acting on a tRNA 139 1 1.34 2 0.00096 0.00041 0.00096
GO:0035091 phosphatidylinositol binding 109 3 1.05 1 0.00025 0.00045 0.00025
GO:0004812 aminoacyl-tRNA ligase activity 90 1 0.87 17 0.01613 0.00098 0.01613
GO:0005509 calcium ion binding 638 6 6.13 3 0.00156 0.00156 0.00156
GO:0004594 pantothenate kinase activity 5 0 0.05 4 0.00283 0.00283 0.00283
GO:0009982 pseudouridine synthase activity 27 0 0.26 5 0.00300 0.00300 0.00300
GO:0008417 fucosyltransferase activity 42 0 0.40 6 0.00422 0.00422 0.00422
GO:0060090 molecular adaptor activity 45 2 0.43 16 0.01539 0.00449 0.01539
GO:0005515 protein binding 4270 46 41.05 28 0.02495 0.00528 0.01108
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(MF.weight01)[sigTerms]),
      caption = paste('Molecular function terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Molecular function terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0140101 catalytic activity, acting on a tRNA NA 0.0004062
GO:0035091 phosphatidylinositol binding Interacting selectively and non-covalently with any inositol-containing glycerophospholipid, i.e. phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0004529
GO:0005509 calcium ion binding Interacting selectively and non-covalently with calcium ions (Ca2+). 0.0015580
GO:0004594 pantothenate kinase activity Catalysis of the reaction: ATP + pantothenate = ADP + D-4’-phosphopantothenate. 0.0028317
GO:0009982 pseudouridine synthase activity Catalysis of the reaction: RNA uridine = RNA pseudouridine. Conversion of uridine in an RNA molecule to pseudouridine by rotation of the C1’-N-1 glycosidic bond of uridine in RNA to a C1’-C5. 0.0030029
GO:0008417 fucosyltransferase activity Catalysis of the transfer of a fucosyl group to an acceptor molecule, typically another carbohydrate or a lipid. 0.0042250
GO:0004630 phospholipase D activity Catalysis of the reaction: a phosphatidylcholine + H2O = choline + a phosphatidate. 0.0059711
showSigOfNodes(GOdata.MF, score(MF.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 38 
## Number of Edges = 41 
## 
## $complete.dag
## [1] "A graph with 38 nodes."
showGroupDensity(GOdata.MF, orderedTerms[1], rm.one=FALSE)

showGroupDensity(GOdata.MF, orderedTerms[2], rm.one=FALSE)

showGroupDensity(GOdata.MF, orderedTerms[3], rm.one=FALSE)

Cellular component

orderedTerms <- names(sort(score(CC.weight01)))
significant.weight01 <- score(CC.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(CC.lea)[orderedTerms] <= 0.01
significant.elim     <- score(CC.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.pvalue.sigTerms <- sigTerms

CC.all <- GenTable(GOdata.CC, elim=CC.elim, weight01=CC.weight01, lea=CC.lea,
                   orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

write.table(CC.all, file=paste(TAG, VAR, 'CCsummary.tsv', sep='/'),
            quote=FALSE, sep='\t', row.names=FALSE)

kable(CC.all,
   caption = "Most over-represented terms of the Cellular Component ontology.")
Most over-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:1902555 endoribonuclease complex 10 0 0.09 10 0.020 0.004 0.020
GO:0005576 extracellular region 197 2 1.72 15 0.026 0.014 0.026
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(CC.weight01)[sigTerms]),
      caption = paste('Cellular component terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Cellular component terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
showSigOfNodes(GOdata.CC, score(CC.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 6 
## Number of Edges = 5 
## 
## $complete.dag
## [1] "A graph with 6 nodes."
showGroupDensity(GOdata.CC, orderedTerms[1], rm.one=FALSE)

showGroupDensity(GOdata.CC, orderedTerms[2], rm.one=FALSE)

showGroupDensity(GOdata.CC, orderedTerms[3], rm.one=FALSE)

Using the portion of variance explained by hatching

Building the topGO object

I need to generate the topGO objects again, using the alternative gene ordering, based on the proportion of expression-level variance explained by hatching. I miss a way to inform the topGOdata object that the score in this case is reversed, with respect to \(p\)-values: the higher it is, the more differentially expressed the gene is. To make sure that GO terms are tested in the same way than when using p-values, I will just reverse the proportion of variance explained by hatching to its complement. Taking this into account, the subset of interesting genes (selection() function) must be defined as the lowest 10% scores, which are the 10% genes with largest portion of variance explained by hatching.

selection <- function(allScores) {return(allScores <= quantile(allScores, probs = 0.10))}
GOdataVar.BP <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'BP', 
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdataVar.MF <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'MF',
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

GOdataVar.CC <- new('topGOdata',
   description = 'Differential expression among Brachionus plicatilis populations.',
   ontology = 'CC',
   allGenes = 1.0 - Variance,
   annot = annFUN.gene2GO,
   gene2GO = allgenes2GO,
   geneSelectionFun = selection,
   nodeSize = 5)

library(knitr)
DataSummary <- data.frame(ontology = c('BP', 'MF', 'CC'),
   Num_Genes = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), numGenes),
   Num_GO_terms = sapply(list(GOdataVar.BP, GOdataVar.MF, GOdataVar.CC), function(x) length(usedGO(x))))
kable(DataSummary, caption='Number of feasible genes or transcripts and number of GO terms used in each data set.')
rm(DataSummary)
Number of feasible genes or transcripts and number of GO terms used in each data set.
ontology Num_Genes Num_GO_terms
BP 26235 1582
MF 30580 791
CC 21616 390

Running the tests

BPvar.elim     <- runTest(GOdataVar.BP, algorithm = "elim",     statistic = "ks")
BPvar.weight01 <- runTest(GOdataVar.BP, algorithm = "weight01", statistic = "ks")
BPvar.lea      <- runTest(GOdataVar.BP, algorithm = "lea",      statistic = "ks")
MFvar.elim     <- runTest(GOdataVar.MF, algorithm = "elim",     statistic = "ks")
MFvar.weight01 <- runTest(GOdataVar.MF, algorithm = "weight01", statistic = "ks")
MFvar.lea      <- runTest(GOdataVar.MF, algorithm = "lea",      statistic = "ks")
CCvar.elim     <- runTest(GOdataVar.CC, algorithm = "elim",     statistic = "ks")
CCvar.weight01 <- runTest(GOdataVar.CC, algorithm = "weight01", statistic = "ks")
CCvar.lea      <- runTest(GOdataVar.CC, algorithm = "lea",      statistic = "ks")

ResultsSummary <- data.frame(ontology = rep(c("BP", "MF", "CC"), each = 3),
   algorithm = rep(c("elim", "weight01", "lea"), 3),
   TermsTested = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) length(score(X))),
   Significant = sapply(list(BPvar.elim, BPvar.weight01, BPvar.lea, MFvar.elim, MFvar.weight01, MFvar.lea, CCvar.elim, CCvar.weight01, CCvar.lea), function(X) sum(score(X) < 0.01)))

kable(ResultsSummary, caption="Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.")
Number of non-trivial terms tested and those with a score (not corrected p-value) lower than 0.01.
ontology algorithm TermsTested Significant
BP elim 1582 33
BP weight01 1582 18
BP lea 1582 66
MF elim 791 23
MF weight01 791 20
MF lea 791 29
CC elim 390 10
CC weight01 390 3
CC lea 390 18
rm(ResultsSummary)

Results

Biological process

orderedTerms <- names(sort(score(BPvar.weight01)))
significant.weight01 <- score(BPvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(BPvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(BPvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
BP.variance.sigTerms <- sigTerms

BPvar.all <- GenTable(GOdataVar.BP, elim=BPvar.elim, weight01=BPvar.weight01, lea=BPvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))

kable(BPvar.all,
   caption = "Most over-represented terms of the Biological Process ontology.")
Most over-represented terms of the Biological Process ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0006511 ubiquitin-dependent protein catabolic pr… 204 33 20.21 9 0.00217 2.2e-05 0.00217
GO:0030513 positive regulation of BMP signaling pat… 9 3 0.89 1 3.6e-05 3.6e-05 3.6e-05
GO:0016579 protein deubiquitination 90 19 8.92 2 8.2e-05 8.2e-05 8.2e-05
GO:0055072 iron ion homeostasis 21 2 2.08 514 0.24052 0.00011 0.24052
GO:0007411 axon guidance 11 3 1.09 5 0.00074 0.00074 0.00074
GO:0007264 small GTPase mediated signal transductio… 228 31 22.59 10 0.00297 0.00123 0.00297
GO:0046434 organophosphate catabolic process 52 6 5.15 362 0.15812 0.00147 0.15812
GO:0032543 mitochondrial translation 5 1 0.50 8 0.00192 0.00192 0.00192
GO:0006030 chitin metabolic process 103 10 10.20 21 0.00606 0.00202 0.00606
GO:0006486 protein glycosylation 106 10 10.50 6 0.00083 0.00292 0.00083
GO:0001522 pseudouridine synthesis 31 9 3.07 13 0.00328 0.00328 0.00328
GO:0006400 tRNA modification 58 12 5.75 59 0.01725 0.00335 0.01725
GO:0006997 nucleus organization 5 3 0.50 14 0.00342 0.00342 0.00342
GO:0035307 positive regulation of protein dephospho… 5 2 0.50 18 0.00516 0.00516 0.00516
GO:0060271 cilium assembly 73 6 7.23 3 0.00039 0.00532 0.00039
GO:0046907 intracellular transport 422 60 41.81 139 0.04443 0.00660 0.00088
GO:0015937 coenzyme A biosynthetic process 8 3 0.79 24 0.00668 0.00668 0.00668
GO:0034474 U2 snRNA 3’-end processing 17 4 1.68 30 0.00915 0.00915 0.00915
GO:0006545 glycine biosynthetic process 7 0 0.69 34 0.01016 0.01016 0.01016
GO:0006397 mRNA processing 158 35 15.65 12 0.00322 0.01045 0.00322
GO:0010972 negative regulation of G2/M transition o… 5 0 0.50 37 0.01103 0.01103 0.01103
GO:0030163 protein catabolic process 241 35 23.87 7 0.00137 0.01230 5.1e-05
GO:1903047 mitotic cell cycle process 86 8 8.52 631 0.32396 0.01245 0.65910
GO:0050953 sensory perception of light stimulus 11 2 1.09 824 0.46346 0.01444 0.46346
GO:0007029 endoplasmic reticulum organization 13 2 1.29 49 0.01522 0.01522 0.01522
GO:0038007 netrin-activated signaling pathway 5 1 0.50 50 0.01527 0.01527 0.01527
GO:0034314 Arp2/3 complex-mediated actin nucleation 17 4 1.68 52 0.01539 0.01539 0.01539
GO:0035195 gene silencing by miRNA 5 1 0.50 53 0.01546 0.01546 0.01546
GO:0070286 axonemal dynein complex assembly 9 2 0.89 56 0.01634 0.01634 0.01634
GO:0046854 phosphatidylinositol phosphorylation 45 11 4.46 63 0.01918 0.01918 0.01918
GO:0006904 vesicle docking involved in exocytosis 22 6 2.18 67 0.01959 0.01959 0.01959
GO:0006418 tRNA aminoacylation for protein translat… 87 7 8.62 60 0.01839 0.01962 0.01839
GO:0051270 regulation of cellular component movemen… 5 1 0.50 68 0.01975 0.01975 0.01975
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(BPvar.weight01)[sigTerms]),
      caption = paste('Biological process terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Biological process terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0006511 ubiquitin-dependent protein catabolic process The chemical reactions and pathways resulting in the breakdown of a protein or peptide by hydrolysis of its peptide bonds, initiated by the covalent attachment of a ubiquitin group, or multiple ubiquitin groups, to the protein. 0.0000222
GO:0030513 positive regulation of BMP signaling pathway Any process that activates or increases the frequency, rate or extent of BMP signaling pathway activity. 0.0000357
GO:0016579 protein deubiquitination The removal of one or more ubiquitin groups from a protein. 0.0000819
GO:0007411 axon guidance The chemotaxis process that directs the migration of an axon growth cone to a specific target site in response to a combination of attractive and repulsive cues. 0.0007397
GO:0007264 small GTPase mediated signal transduction Any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. 0.0012329
GO:0032543 mitochondrial translation The chemical reactions and pathways resulting in the formation of a protein in a mitochondrion. This is a ribosome-mediated process in which the information in messenger RNA (mRNA) is used to specify the sequence of amino acids in the protein; the mitochondrion has its own ribosomes and transfer RNAs, and uses a genetic code that differs from the nuclear code. 0.0019207
GO:0006030 chitin metabolic process The chemical reactions and pathways involving chitin, a linear polysaccharide consisting of beta-(1->4)-linked N-acetyl-D-glucosamine residues. 0.0020236
GO:0006486 protein glycosylation A protein modification process that results in the addition of a carbohydrate or carbohydrate derivative unit to a protein amino acid, e.g. the addition of glycan chains to proteins. 0.0029192
GO:0001522 pseudouridine synthesis The intramolecular conversion of uridine to pseudouridine within an RNA molecule. This posttranscriptional base modification occurs in tRNA, rRNA, and snRNAs. 0.0032848
GO:0006997 nucleus organization A process that is carried out at the cellular level which results in the assembly, arrangement of constituent parts, or disassembly of the nucleus. 0.0034201
GO:0035307 positive regulation of protein dephosphorylation Any process that activates or increases the frequency, rate or extent of removal of phosphate groups from a protein. 0.0051619
GO:0060271 cilium assembly The assembly of a cilium, a specialized eukaryotic organelle that consists of a filiform extrusion of the cell surface. Each cilium is bounded by an extrusion of the cytoplasmic membrane, and contains a regular longitudinal array of microtubules, anchored basally in a centriole. 0.0053173
GO:0015937 coenzyme A biosynthetic process The chemical reactions and pathways resulting in the formation of coenzyme A, 3’-phosphoadenosine-(5’)diphospho(4’)pantatheine, an acyl carrier in many acylation and acyl-transfer reactions in which the intermediate is a thiol ester. 0.0066754
GO:0034474 U2 snRNA 3’-end processing Any process involved in forming the mature 3’ end of a U2 snRNA molecule. 0.0091518
showSigOfNodes(GOdataVar.BP, score(BPvar.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 381 
## Number of Edges = 772 
## 
## $complete.dag
## [1] "A graph with 381 nodes."

Below I plot variance portion, but for the termo found most significant when using p-values, for comparison.

showGroupDensity(GOdataVar.BP, orderedTerms[1], rm.one=FALSE)

showGroupDensity(GOdataVar.BP, orderedTerms[2], rm.one=FALSE)

showGroupDensity(GOdataVar.BP, orderedTerms[3], rm.one=FALSE)

Molecular function

orderedTerms <- names(sort(score(MFvar.weight01)))
significant.weight01 <- score(MFvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(MFvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(MFvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
MF.variance.sigTerms <- sigTerms

MFvar.all <- GenTable(GOdataVar.MF, elim=MFvar.elim, weight01=MFvar.weight01, lea=MFvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=sum(significant.elim))
kable(MFvar.all,
   caption = "Most over-represented terms of the Molecular Function ontology.")
Most over-represented terms of the Molecular Function ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0005515 protein binding 4270 487 418.76 4 0.00056 5.9e-05 0.00056
GO:0036459 thiol-dependent ubiquitinyl hydrolase ac… 103 23 10.10 1 3e-05 6.7e-05 3e-05
GO:0038023 signaling receptor activity 471 34 46.19 325 0.41565 0.00025 0.48567
GO:0035091 phosphatidylinositol binding 109 23 10.69 2 0.00033 0.00055 0.00033
GO:0008061 chitin binding 112 10 10.98 5 0.00061 0.00061 0.00061
GO:0004298 threonine-type endopeptidase activity 31 4 3.04 6 0.00087 0.00087 0.00087
GO:0140101 catalytic activity, acting on a tRNA 139 16 13.63 10 0.00161 0.00118 0.00161
GO:0008417 fucosyltransferase activity 42 4 4.12 8 0.00136 0.00136 0.00136
GO:0004594 pantothenate kinase activity 5 2 0.49 9 0.00142 0.00142 0.00142
GO:0003676 nucleic acid binding 2125 239 208.40 3 0.00052 0.00238 0.00052
GO:0016417 S-acyltransferase activity 6 4 0.59 11 0.00247 0.00247 0.00247
GO:0016742 hydroxymethyl-, formyl- and related tran… 9 3 0.88 13 0.00276 0.00276 0.00276
GO:0016887 ATPase activity 327 43 32.07 212 0.24838 0.00715 0.24838
GO:0004812 aminoacyl-tRNA ligase activity 90 7 8.83 42 0.03055 0.00828 0.03055
GO:0060090 molecular adaptor activity 45 13 4.41 7 0.00121 0.00840 0.00121
GO:0005220 inositol 1,4,5-trisphosphate-sensitive c… 15 2 1.47 18 0.00879 0.00879 0.00879
GO:0070679 inositol 1,4,5 trisphosphate binding 15 2 1.47 19 0.00879 0.00879 0.00879
GO:0005516 calmodulin binding 52 10 5.10 20 0.00948 0.00948 0.00948
GO:0003993 acid phosphatase activity 10 1 0.98 22 0.00974 0.00974 0.00974
GO:0004312 fatty acid synthase activity 6 3 0.59 23 0.00998 0.00998 0.00998
GO:0046527 glucosyltransferase activity 17 4 1.67 16 0.00693 0.01026 0.00693
GO:0005102 signaling receptor binding 47 7 4.61 112 0.13378 0.01048 0.13378
GO:0016748 succinyltransferase activity 5 3 0.49 24 0.01065 0.01065 0.01065
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(MFvar.weight01)[sigTerms]),
      caption = paste('Molecular functions terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Molecular functions terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0005515 protein binding Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules). 0.0000586
GO:0036459 thiol-dependent ubiquitinyl hydrolase activity Catalysis of the thiol-dependent hydrolysis of an ester, thioester, amide, peptide or isopeptide bond formed by the C-terminal glycine of ubiquitin. 0.0000670
GO:0035091 phosphatidylinositol binding Interacting selectively and non-covalently with any inositol-containing glycerophospholipid, i.e. phosphatidylinositol (PtdIns) and its phosphorylated derivatives. 0.0005496
GO:0008061 chitin binding Interacting selectively and non-covalently with chitin, a linear polysaccharide consisting of beta-(1->4)-linked N-acetyl-D-glucosamine residues. 0.0006105
GO:0004298 threonine-type endopeptidase activity Catalysis of the hydrolysis of internal peptide bonds in a polypeptide chain by a mechanism in which the hydroxyl group of a threonine residue at the active center acts as a nucleophile. 0.0008677
GO:0140101 catalytic activity, acting on a tRNA NA 0.0011815
GO:0008417 fucosyltransferase activity Catalysis of the transfer of a fucosyl group to an acceptor molecule, typically another carbohydrate or a lipid. 0.0013571
GO:0004594 pantothenate kinase activity Catalysis of the reaction: ATP + pantothenate = ADP + D-4’-phosphopantothenate. 0.0014241
GO:0003676 nucleic acid binding Interacting selectively and non-covalently with any nucleic acid. 0.0023808
GO:0016417 S-acyltransferase activity Catalysis of the transfer of an acyl group to a sulfur atom on the acceptor molecule. 0.0024717
GO:0016742 hydroxymethyl-, formyl- and related transferase activity Catalysis of the transfer of a hydroxymethyl- or formyl group from one compound (donor) to another (acceptor). 0.0027562
GO:0060090 molecular adaptor activity The binding activity of a molecule that brings together two or more molecules through a selective, non-covalent, often stoichiometric interaction, permitting those molecules to function in a coordinated way. 0.0083968
GO:0005220 inositol 1,4,5-trisphosphate-sensitive calcium-release channel activity Enables the transmembrane transfer of a calcium ion by a channel that opens when inositol 1,4,5-trisphosphate (IP3) has been bound by the channel complex or one of its constituent parts. 0.0087864
GO:0070679 inositol 1,4,5 trisphosphate binding Interacting selectively and non-covalently with inositol 1,4,5 trisphosphate. 0.0087864
GO:0005516 calmodulin binding Interacting selectively and non-covalently with calmodulin, a calcium-binding protein with many roles, both in the calcium-bound and calcium-free states. 0.0094780
GO:0003993 acid phosphatase activity Catalysis of the reaction: an orthophosphoric monoester + H2O = an alcohol + phosphate, with an acid pH optimum. 0.0097424
GO:0004312 fatty acid synthase activity Catalysis of the reaction: acetyl-CoA + n malonyl-CoA + 2n NADPH + 2n H+ = long-chain fatty acid + n+1 CoA + n CO2 + 2n NADP+. 0.0099792
showSigOfNodes(GOdataVar.MF, score(MFvar.weight01),
               firstSigNodes = sum(significant.elim),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 87 
## Number of Edges = 107 
## 
## $complete.dag
## [1] "A graph with 87 nodes."

I plot variance portion, but for the term found most significant when using p-values, for comparison.

showGroupDensity(GOdataVar.MF, orderedTerms[1], rm.one=FALSE)

showGroupDensity(GOdataVar.MF, orderedTerms[2], rm.one=FALSE)

showGroupDensity(GOdataVar.MF, orderedTerms[3], rm.one=FALSE)

Cellular component

orderedTerms <- names(sort(score(CCvar.weight01)))
significant.weight01 <- score(CCvar.weight01)[orderedTerms] <= 0.01
significant.lea      <- score(CCvar.lea)[orderedTerms] <= 0.01
significant.elim     <- score(CCvar.elim)[orderedTerms] <= 0.01
sigTerms <- orderedTerms[significant.weight01 & significant.lea & significant.elim]
if (length(sigTerms) == 0) {
   sigTerms <- orderedTerms[significant.lea & significant.elim]
}
# sigTerms gets updated, and is already used in the code, but I need a permanent copy for later.
CC.variance.sigTerms <- sigTerms

CCvar.all <- GenTable(GOdataVar.CC, elim=CCvar.elim, weight01=CCvar.weight01, lea=CCvar.lea,
                      orderBy="weight01", ranksOf="elim", topNodes=max(sum(significant.elim), 10))

kable(CCvar.all,
   caption = "Most over-represented terms of the Cellular Component ontology.")
Most over-represented terms of the Cellular Component ontology.
GO.ID Term Annotated Significant Expected Rank in elim elim weight01 lea
GO:0016272 prefoldin complex 7 0 0.66 6 0.0053 0.0053 0.00531
GO:0005737 cytoplasm 1176 121 110.82 22 0.0242 0.0073 0.00071
GO:0045239 tricarboxylic acid cycle enzyme complex 5 3 0.47 10 0.0100 0.0100 0.00996
GO:0005815 microtubule organizing center 96 15 9.05 28 0.0286 0.0105 0.15049
GO:1902555 endoribonuclease complex 10 3 0.94 46 0.0602 0.0105 0.06022
GO:0005761 mitochondrial ribosome 18 4 1.70 3 0.0035 0.0111 0.00347
GO:0005576 extracellular region 197 21 18.56 27 0.0282 0.0130 0.02818
GO:0030015 CCR4-NOT core complex 15 3 1.41 13 0.0153 0.0153 0.01528
GO:0005802 trans-Golgi network 12 1 1.13 14 0.0158 0.0158 0.01577
GO:0031981 nuclear lumen 280 37 26.39 24 0.0253 0.0162 0.02534
vennDiagram(vennCounts(cbind(weight01=significant.weight01,
                             lea=significant.lea,
                             elim=significant.elim)))

kable(data.frame(Term=Term(GOTERM[sigTerms]),
                 Definition=Definition(GOTERM[sigTerms]),
                 PValue=score(CCvar.weight01)[sigTerms]),
      caption = paste('Cellular component terms significantly associated with', VAR,
                      'according to all 3 algorithms', sep=' '))
Cellular component terms significantly associated with hatching according to all 3 algorithms
Term Definition PValue
GO:0016272 prefoldin complex A multisubunit chaperone that is capable of delivering unfolded proteins to cytosolic chaperonin, which it acts as a cofactor for. In humans, the complex is a heterohexamer of two PFD-alpha and four PFD-beta type subunits. In Saccharomyces cerevisiae, it also acts in the nucleus to regulate the rate of elongation by RNA polymerase II via a direct effect on histone dynamics. 0.0053057
GO:0045239 tricarboxylic acid cycle enzyme complex Any of the heteromeric enzymes that act in the TCA cycle. 0.0099588
showSigOfNodes(GOdataVar.CC, score(CCvar.weight01),
               firstSigNodes = sum(significant.weight01),
               wantedNodes = sigTerms)

## $dag
## A graphNEL graph with directed edges
## Number of Nodes = 10 
## Number of Edges = 13 
## 
## $complete.dag
## [1] "A graph with 10 nodes."

For comparison, I plot the distribution of variance portion for the CC term found most significant when using p-values.

showGroupDensity(GOdataVar.CC, orderedTerms[1], rm.one=FALSE)

showGroupDensity(GOdataVar.CC, orderedTerms[2], rm.one=FALSE)

showGroupDensity(GOdataVar.CC, orderedTerms[3], rm.one=FALSE)

Comparison between the two ordering of genes.

Biological process

allTerms <- usedGO(GOdata.BP)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% BP.pvalue.sigTerms,
                             Variance = allTerms %in% BP.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(BP.weight01))[allTerms],
                         Variance = rank(score(BPvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth(method='lm') + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of BP terms by significance')

Molecular function

allTerms <- usedGO(GOdata.MF)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% MF.pvalue.sigTerms,
                             Variance = allTerms %in% MF.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(MF.weight01))[allTerms],
                         Variance = rank(score(MFvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth(methdo='lm') + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of MF terms by significance')
## Warning: Ignoring unknown parameters: methdo

Cellular component

allTerms <- usedGO(GOdata.CC)
vennDiagram(vennCounts(cbind(PValue   = allTerms %in% CC.pvalue.sigTerms,
                             Variance = allTerms %in% CC.variance.sigTerms)))

ggplot(data = data.frame(PValue   = rank(score(CC.weight01))[allTerms],
                         Variance = rank(score(CCvar.weight01))[allTerms]),
       mapping = aes(x = PValue, y = Variance)) +
   geom_point() + geom_smooth(method='lm') + xlab('Using gene p-values') +
   ylab('Using portion of explained variance') +
   ggtitle('Ordering of CC terms by significance')

Session info

I save the main variables to be able to load them in a new R session later.

save(allgenes2GO,
     GOdata.BP, BP.elim, BP.weight01, BP.lea, BP.pvalue.sigTerms,
     GOdata.MF, MF.elim, MF.weight01, MF.lea, MF.pvalue.sigTerms,
     GOdata.CC, CC.elim, CC.weight01, CC.lea, CC.pvalue.sigTerms,
     GOdataVar.BP, BPvar.elim, BPvar.weight01, BPvar.lea, BP.variance.sigTerms,
     GOdataVar.MF, MFvar.elim, MFvar.weight01, MFvar.lea, MF.variance.sigTerms,
     GOdataVar.CC, CCvar.elim, CCvar.weight01, CCvar.lea, CC.variance.sigTerms,
     file = paste('Enrichment', TAG, VAR, 'RData', sep='.'))
sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=ca_ES.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=ca_ES.UTF-8        LC_COLLATE=ca_ES.UTF-8    
##  [5] LC_MONETARY=ca_ES.UTF-8    LC_MESSAGES=ca_ES.UTF-8   
##  [7] LC_PAPER=ca_ES.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=ca_ES.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
##  [1] grid      stats4    parallel  stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] Rgraphviz_2.30.0     ggplot2_3.3.0        limma_3.42.2        
##  [4] knitr_1.28           topGO_2.38.1         SparseM_1.78        
##  [7] GO.db_3.10.0         AnnotationDbi_1.48.0 IRanges_2.20.2      
## [10] S4Vectors_0.24.3     Biobase_2.46.0       graph_1.64.0        
## [13] BiocGenerics_0.32.0 
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.0.0   xfun_0.12          purrr_0.3.3        splines_3.6.3     
##  [5] lattice_0.20-41    colorspace_1.4-1   vctrs_0.2.4        htmltools_0.4.0   
##  [9] yaml_2.2.1         mgcv_1.8-33        blob_1.2.1         rlang_0.4.5       
## [13] pillar_1.4.3       glue_1.4.0         withr_2.1.2        DBI_1.1.0         
## [17] bit64_0.9-7        matrixStats_0.55.0 lifecycle_0.2.0    stringr_1.4.0     
## [21] munsell_0.5.0      gtable_0.3.0       memoise_1.1.0      evaluate_0.14     
## [25] labeling_0.3       fansi_0.4.1        highr_0.8          Rcpp_1.0.4        
## [29] scales_1.1.0       farver_2.0.3       bit_1.1-15.2       digest_0.6.25     
## [33] stringi_1.4.6      dplyr_0.8.5        cli_2.0.2          tools_3.6.3       
## [37] magrittr_1.5       RSQLite_2.2.0      tibble_3.0.0       crayon_1.3.4      
## [41] pkgconfig_2.0.3    Matrix_1.2-18      ellipsis_0.3.0     assertthat_0.2.1  
## [45] rmarkdown_2.1      R6_2.4.1           nlme_3.1-149       compiler_3.6.3